1. Introduction

Let $X_n = (X_{ij})_{1\le i\le n,\,1\le j\le m}$ be a data matrix whose $n$ rows form independent samples from some population distribution with mean vector $\mu_n$ and covariance matrix $\Sigma_n$. High dimensional data increasingly occur in modern statistical applications in biology, finance and wireless communication, where the dimension $m$ may be comparable to the number of observations $n$, or even much larger than $n$. Therefore, it is necessary to study the asymptotic behavior of statistics of $X_n$ under the setting that $m = m_n$ grows to infinity as $n$ goes to infinity.

In many empirical examples, it is often assumed that $\Sigma_n = I_m$, where $I_m$ is the $m \times m$ identity matrix, so it is important to perform the test
\[ H_0 : \Sigma_n = I_m \tag{1} \]

before carrying out further estimation or inference procedures. Due to high dimensionality, conventional tests often do not work well or cannot be implemented. For example, when $m > n$, the likelihood ratio test (LRT) cannot be used because the sample covariance matrix is singular; and even when $m < n$, the LRT statistic drifts to infinity and leads to many false rejections if $m$ is also large (Bai et al., 2009). Ledoit and Wolf (2002) found that the empirical distance test (Nagao, 1973) is not consistent when both $m$ and $n$ are large. The problem has been studied by several authors under the "large $n$, large $m$" paradigm. Bai et al. (2009) and Ledoit and Wolf (2002) proposed corrections to the LRT and the empirical distance test respectively. Assuming that the population distribution is Gaussian with $\mu_n = 0$, Johnstone (2001) used the largest eigenvalue of the sample covariance matrix $X_n^T X_n$ as the test statistic, and proved that its limiting distribution follows the Tracy-Widom law (Tracy and Widom, 1994). Here we use the superscript $T$ to denote the transpose of a matrix or a vector. His work was extended to the non-Gaussian case by Soshnikov (2002) and Péché (2009), where they assumed the entries of $X_n$ are independent and identically distributed (i.i.d.) with sub-Gaussian tails.

Let $x_1, x_2, \ldots, x_m$ be the $m$ columns of $X_n$. In practice, the entries of the mean vector $\mu_n$ are often unknown, and are estimated by $\bar x_i = (1/n)\sum_{k=1}^n X_{ki}$. Write $x_i - \bar x_i$ for the vector $x_i - \bar x_i 1_n$, where $1_n$ is the $n$-dimensional vector with all entries being one. Let $\sigma_{ij} = \mathrm{Cov}(X_{1i}, X_{1j})$, $1 \le i, j \le m$, be the covariance function, namely, the $(i,j)$th entry of $\Sigma_n$. The sample covariance between columns $x_i$ and $x_j$ is defined as
\[ \hat\sigma_{ij} = \frac{1}{n}(x_i - \bar x_i)^T (x_j - \bar x_j). \]

In high-dimensional covariance inference, a fundamental problem is to establish an asymptotic distributional theory for the maximum deviation
\[ M_n = \max_{1\le i<j\le m} |\hat\sigma_{ij} - \sigma_{ij}|. \]



With such a distributional theory, one can perform statistical inference for structures of covariance matrices. For example, one can use $M_n$ to test the null hypothesis $H_0 : \Sigma_n = \Sigma^{(0)}$, where $\Sigma^{(0)}$ is a pre-specified matrix. Here the null hypothesis can be that the population distribution is a stationary process so that $\Sigma_n$ is Toeplitz, or that $\Sigma_n$ has a banded structure.

It is very challenging to derive an asymptotic theory for $M_n$ if we allow dependence among $X_{11}, \ldots, X_{1m}$. Many of the earlier results assume that the entries of the data matrix $X_n$ are i.i.d. In this case $\sigma_{ij} = 0$ if $i \ne j$. Jiang (2004) derived the asymptotic distribution of
\[ L_n = \max_{1\le i<j\le m} |\hat\sigma_{ij}|. \]



Theorem 1 (Jiang, 2004). Suppose $X_{ij}$, $i, j = 1, 2, \ldots$ are independent and identically distributed as $\xi$ which has variance one. Suppose $E|\xi|^{30-\epsilon} < \infty$ for any $\epsilon > 0$. If $n/m \to c \in (0, \infty)$, then for any $y \in \mathbb{R}$,



\[ \lim_{n\to\infty} P\left(nL_n^2 - 4\log m + \log(\log m) + \log(8\pi) \le y\right) = \exp\left(-e^{-y/2}\right). \]



Jiang's work has attracted considerable attention, and has been followed by Li et al. (2010), Liu et al. (2008), Zhou (2007) and Li and Rosalsky (2006). Under the same setup that $X_n$ consists of i.i.d. entries, these works focus on three directions: (i) reduce the moment condition; (ii) allow a wider range of the dimension $p$; and (iii) show that some moment condition is necessary. In a recent article, Cai and Jiang (2011) extended those results in two ways: (i) the dimension $p$ could grow exponentially with the sample size $n$, provided exponential moment conditions hold; and (ii) they showed that the test statistic $\max_{|i-j|>s_n} |\hat\sigma_{ij}|$ also converges to the Gumbel distribution if each row of $X_n$ is Gaussian and is $s_n$-dependent. The latter generalization is important since it is one of the very few results that allow dependent entries.

In this paper we shall show that a self-normalized version of $M_n$ converges to the Gumbel distribution under mild dependence conditions on the vector $(X_{11}, \ldots, X_{1m})$. Thus our result provides a theoretical foundation for high-dimensional simultaneous inference of covariances.

The rest of this article is organized as follows. We present the main result in Section 2. In Section 3, we use two examples on linear processes and nonlinear processes to demonstrate that the technical conditions are easily satisfied. We discuss three tests for the covariance structure using our main result in Section 4. The proof is given in Section 5, and some auxiliary results are collected in Section 6.



2. Main result 

We consider a slightly more general situation where the population distribution can depend on $n$. Let $X_n = (X_{n,k,i})_{1\le k\le n,\,1\le i\le m}$ be a data matrix whose $n$ rows are i.i.d. $m$-dimensional random vectors with mean $\mu_n = (\mu_{n,i})_{1\le i\le m}$ and covariance matrix $\Sigma_n = (\sigma_{n,i,j})_{1\le i,j\le m}$. Let $x_1, x_2, \ldots, x_m$ be the $m$ columns of $X_n$. Let $\bar x_i = (1/n)\sum_{k=1}^n X_{n,k,i}$, and write $x_i - \bar x_i$ for the vector $x_i - \bar x_i 1_n$. The sample covariance between $x_i$ and $x_j$ is defined as
\[ \hat\sigma_{n,i,j} = \frac{1}{n}(x_i - \bar x_i)^T (x_j - \bar x_j). \]

It is unnatural to study the maximum of a collection of random variables which are on different scales, so we consider the normalized version $|\hat\sigma_{n,i,j} - \sigma_{n,i,j}|/\sqrt{\tau_{n,i,j}}$, where
\[ \tau_{n,i,j} = \mathrm{Var}\left[(X_{n,1,i} - \mu_{n,i})(X_{n,1,j} - \mu_{n,j})\right]. \]
In practice, $\tau_{n,i,j}$ are usually unknown, and can be estimated by
\[ \hat\tau_{n,i,j} = \frac{1}{n}\left|(x_i - \bar x_i)\circ(x_j - \bar x_j) - \hat\sigma_{n,i,j}\cdot 1_n\right|^2, \]
where $\circ$ denotes the Hadamard product defined as $A \circ B := (a_{ij}b_{ij})$ for two matrices $A = (a_{ij})$ and $B = (b_{ij})$ with the same dimensions. We thus consider
\[ M_n = \max_{1\le i<j\le m} \frac{|\hat\sigma_{n,i,j} - \sigma_{n,i,j}|}{\sqrt{\hat\tau_{n,i,j}}}. \tag{2} \]
Due to the normalization procedure, we can assume without loss of generality that $\sigma_{n,i,i} = 1$ and $\mu_{n,i} = 0$ for each $1 \le i \le m$.
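The statistic in (2) can be computed directly from a data matrix. The following is a minimal sketch in plain Python, assuming the data are supplied as a list of $n$ rows of length $m$; the function name `self_normalized_max_deviation` and the callable `sigma0` (returning the hypothesised covariance for a pair of column indices) are illustrative names introduced here, not part of the paper.

```python
import math


def self_normalized_max_deviation(rows, sigma0):
    """Return max over i < j of |sigma_hat_ij - sigma0(i, j)| / sqrt(tau_hat_ij).

    rows   : list of n observations, each a list of m numbers (hypothetical input layout)
    sigma0 : callable (i, j) -> hypothesised covariance (an assumption of this sketch)
    """
    n, m = len(rows), len(rows[0])
    means = [sum(r[i] for r in rows) / n for i in range(m)]
    centred = [[r[i] - means[i] for i in range(m)] for r in rows]
    best = 0.0
    for i in range(m):
        for j in range(i + 1, m):
            # entries of the Hadamard product of the two centred columns
            prods = [c[i] * c[j] for c in centred]
            sigma_hat = sum(prods) / n
            tau_hat = sum((p - sigma_hat) ** 2 for p in prods) / n
            if tau_hat > 0.0:  # skip degenerate pairs
                dev = abs(sigma_hat - sigma0(i, j)) / math.sqrt(tau_hat)
                best = max(best, dev)
    return best
```

The double loop costs of the order of $n m^2$ operations, which is adequate for illustration; a vectorised implementation would be preferable for large $m$.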

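Define the index set $\mathcal{I}_n = \{(i,j) : 1 \le i < j \le m\}$, and for $\alpha = (i,j) \in \mathcal{I}_n$, let $X_{n,\alpha} := X_{n,1,i}X_{n,1,j}$. Define
\[ \mathcal{K}_n(t,p) = \sup_{1\le i\le m} E\exp\left(t|X_{n,1,i}|^p\right), \]
\[ \mathcal{M}_n(p) = \sup_{1\le i\le m} E\left(|X_{n,1,i}|^p\right), \]
\[ \tau_n = \inf_{1\le i<j\le m} \tau_{n,i,j}, \]
\[ \gamma_n = \sup_{\alpha,\beta\in\mathcal{I}_n \text{ and } \alpha\ne\beta} |\mathrm{Cor}(X_{n,\alpha}, X_{n,\beta})|, \]
\[ \gamma_n(b) = \sup_{\alpha\in\mathcal{I}_n}\ \sup_{A\subset\mathcal{I}_n,\,|A|=b}\ \inf_{\beta\in A} |\mathrm{Cor}(X_{n,\alpha}, X_{n,\beta})|. \]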

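We need the following technical conditions.
(A1). $\liminf_{n\to\infty} \tau_n > 0$.
(A2). $\limsup_{n\to\infty} \gamma_n < 1$.
(A3). $\gamma_n(b_n)\cdot\log(b_n) = o(1)$ for any sequence $(b_n)$ such that $b_n \to \infty$.
(A3'). $\gamma_n(b_n) = o(1)$ for any sequence $(b_n)$ such that $b_n \to \infty$, and $\sum_{\alpha,\beta\in\mathcal{I}_n} [\mathrm{Cov}(X_{n,\alpha}, X_{n,\beta})]^2 = O(m^{4-\epsilon})$ for some constant $\epsilon > 0$.
(A4). $\log m = o(n^{p/(4+2p)})$ and $\limsup_{n\to\infty} \mathcal{K}_n(t,p) < \infty$ for some constants $t > 0$ and $0 < p \le 4$.
(A4'). $m = O(n^q)$ and $\limsup_{n\to\infty} \mathcal{M}_n(4q + 4 + \delta) < \infty$ for some constants $q > 0$ and $\delta > 0$.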

The two conditions (A3) and (A3') require that the dependence among $X_{n,\alpha}$, $\alpha \in \mathcal{I}_n$, is not too strong. They are translations of (B1) and (B2) in Section 6.1 (see Remark 2 for some equivalent versions), and either of them will make our results valid. We use (A2) to rule out the case where there may be many pairs $(\alpha, \beta)$ with $\alpha, \beta \in \mathcal{I}_n$ such that $X_{n,\alpha}$ and $X_{n,\beta}$ are perfectly correlated. Assumptions (A4) and (A4') connect the growth speed of $m$ relative to $n$ and the moment conditions. They are typical in the context of high dimensional covariance matrix estimation. Condition (A1) excludes the case that $X_{n,\alpha}$ is a constant.

Theorem 2. Suppose that $X_n = (X_{n,k,i})_{1\le k\le n,\,1\le i\le m}$ is a data matrix whose $n$ rows are i.i.d. $m$-dimensional random vectors, and whose entries have mean zero and variance one. Assume (A1), (A2), either of (A3) and (A3'), and either of (A4) and (A4'), then for any $y \in \mathbb{R}$,
\[ \lim_{n\to\infty} P\left(nM_n^2 - 4\log m + \log(\log m) + \log(8\pi) \le y\right) = \exp\left(-e^{-y/2}\right). \]
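To illustrate how the limit in Theorem 2 would be turned into a test, the sketch below inverts the limiting distribution function $\exp(-e^{-y/2})$ to obtain a critical value and compares it with the centred statistic. The function names `gumbel_critical_value` and `rejects` are hypothetical, and the decision rule is only the natural large-sample rule suggested by the theorem, not a procedure stated in the paper.

```python
import math


def gumbel_critical_value(alpha):
    """Solve exp(-exp(-y/2)) = 1 - alpha for y (upper-alpha quantile of the limit law)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    return -2.0 * math.log(-math.log(1.0 - alpha))


def rejects(n, m, max_dev, alpha=0.05):
    """Reject when n*M_n^2 - 4 log m + log log m + log(8 pi) exceeds the critical value.

    max_dev is the observed value of the self-normalized maximum deviation M_n.
    """
    centred = (n * max_dev ** 2 - 4.0 * math.log(m)
               + math.log(math.log(m)) + math.log(8.0 * math.pi))
    return centred > gumbel_critical_value(alpha)
```

Because convergence to Gumbel-type limits is typically slow, such a rule should be read as an asymptotic approximation only.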
3. Examples

Except for (A4) and (A4'), which put conditions on every single entry of the random vector $(X_{n,1,i})_{1\le i\le m}$, the other conditions of Theorem 2 are related to the dependence among these entries, which can be arbitrarily complicated. In this section we shall provide examples which satisfy the four conditions (A1), (A2), (A3) and (A3'). Observe that if each row of $X_n$ is a random vector with uncorrelated entries (specifically, the entries are independent), then all these conditions are automatically satisfied. They are also satisfied if the number of non-zero covariances is bounded.



3.1. Stationary Processes 



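Suppose $(X_{n,k,i}) = (X_{k,i})$, and each row of $(X_{k,i})_{1\le i\le m}$ is distributed as a stationary process $(X_i)_{1\le i\le m}$ of the form
\[ X_i = g(\epsilon_i, \epsilon_{i-1}, \ldots) \]
where $\epsilon_i$'s are i.i.d. random variables, and $g$ is a measurable function such that $X_i$ is well-defined. Let $(\epsilon_i')_{i\in\mathbb{Z}}$ be an i.i.d. copy of $(\epsilon_i)_{i\in\mathbb{Z}}$, and $X_i' = g(\epsilon_i, \ldots, \epsilon_1, \epsilon_0', \epsilon_{-1}, \epsilon_{-2}, \ldots)$. Following Wu (2005), define the physical dependence measure of order $p$ by
\[ \delta_p(i) = \|X_i - X_i'\|_p. \]
Define the squared tail sum
\[ \Psi_p(k) = \left[\sum_{i=k}^{\infty} (\delta_p(i))^2\right]^{1/2}, \]
and use $\Psi_p$ as a shorthand for $\Psi_p(0)$.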

We give sufficient conditions for (A1), (A2), (A3) and (A3') in the following lemma and leave its proof to the supplementary file.

Lemma 3. (i) If $0 < \Psi_4 < \infty$ and $\mathrm{Var}(X_iX_j) > 0$ for all $i, j \in \mathbb{Z}$, then (A1) holds.
(ii) If in addition, $|\mathrm{Cor}(X_iX_j, X_kX_l)| < 1$ for all $i, j, k, l$ such that they are not all the same, then (A2) holds.
(iii) Assume that the conditions of (i) and (ii) hold. If $\Psi_p(k) = o(1/\log k)$ as $k \to \infty$, then (A3) holds. If $\sum_{j=0}^{m} (\Psi_4(j))^2 = O(m^{1-\delta})$ for some $\delta > 0$, then (A3') holds.

Remark 1. Let $g$ be a linear function with $g(\epsilon_i, \epsilon_{i-1}, \ldots) = \sum_{j=0}^{\infty} a_j\epsilon_{i-j}$, where $\epsilon_j$ are i.i.d. with mean 0 and $E(|\epsilon_j|^p) < \infty$ and $a_j$ are real coefficients with $\sum_{j=0}^{\infty} a_j^2 < \infty$. Then the physical dependence measure $\delta_p(i) = |a_i|\,\|\epsilon_0 - \epsilon_0'\|_p$. If $a_i = i^{-\beta}\ell(i)$, where $1/2 < \beta < 1$ and $\ell$ is a slowly varying function, then $(X_i)$ is a long memory process. Smaller $\beta$ indicates stronger dependence. Condition (iii) holds for all $\beta \in (1/2, 1)$. Moreover, if $a_i = i^{-1/2}(\log(i))^{-2}$, $i \ge 2$, which corresponds to the extremal case with very strong dependence $\beta = 1/2$, we also have $\Psi_p(k) = O((\log k)^{-3/2}) = o(1/\log k)$. So our dependence conditions are actually quite mild.
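The rate claimed for the extremal case can be checked by an integral comparison. The following is a sketch under the assumptions of the remark, writing $c = \|\epsilon_0 - \epsilon_0'\|_p$ (a shorthand introduced here) and taking $k \ge 3$ so that the summand is decreasing:

```latex
\Psi_p(k)^2
  = c^2 \sum_{i \ge k} \frac{1}{i\,(\log i)^4}
  \le c^2 \left[ \frac{1}{k\,(\log k)^4} + \int_k^\infty \frac{dx}{x\,(\log x)^4} \right]
  = c^2 \left[ \frac{1}{k\,(\log k)^4} + \frac{1}{3\,(\log k)^3} \right].
```

Taking square roots gives $\Psi_p(k) = O((\log k)^{-3/2})$, which is $o(1/\log k)$.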

If $(X_i)$ is a linear process which is not identically zero, then the following regularity conditions are automatically satisfied: $\Psi_4 > 0$, $\mathrm{Var}(X_iX_j) > 0$ for all $i, j \in \mathbb{Z}$, and $|\mathrm{Cor}(X_iX_j, X_kX_l)| < 1$ for all $i, j, k, l$ such that they are not all the same.

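3.2. Non-stationary Linear Processes

Assume that each row of $(X_{n,k,i})$ is distributed as $(X_{n,i})_{1\le i\le m}$, which is of the form
\[ X_{n,i} = \sum_{t\in\mathbb{Z}} f_{n,i,t}\,\epsilon_{i-t}, \]
where $\epsilon_i$, $i \in \mathbb{Z}$, are i.i.d. random variables with mean zero, variance one and finite fourth moment, and the sequence $(f_{n,i,t})$ satisfies $\sum_{t\in\mathbb{Z}} f_{n,i,t}^2 = 1$. Denote by $\kappa_4$ the fourth cumulant of $\epsilon_0$. For $1 \le i, j, k, l \le m$, we have
\[ \sigma_{n,i,j} = \sum_{t\in\mathbb{Z}} f_{n,i,i-t}\,f_{n,j,j-t}, \]
\[ \mathrm{Cov}(X_{n,i}X_{n,j}, X_{n,k}X_{n,l}) = \mathrm{Cum}(X_{n,i}, X_{n,j}, X_{n,k}, X_{n,l}) + \sigma_{n,i,k}\sigma_{n,j,l} + \sigma_{n,i,l}\sigma_{n,j,k}, \]
where $\mathrm{Cum}(X_{n,i}, X_{n,j}, X_{n,k}, X_{n,l})$ is the fourth order joint cumulant of the random vector $(X_{n,i}, X_{n,j}, X_{n,k}, X_{n,l})^T$, which can be expressed as
\[ \mathrm{Cum}(X_{n,i}, X_{n,j}, X_{n,k}, X_{n,l}) = \sum_{t\in\mathbb{Z}} f_{n,i,i-t}\,f_{n,j,j-t}\,f_{n,k,k-t}\,f_{n,l,l-t}\,\kappa_4, \]
by the multilinearity of cumulants. In particular, we have
\[ \mathrm{Var}(X_{n,i}X_{n,j}) = 1 + \sigma_{n,i,j}^2 + \kappa_4\sum_{t\in\mathbb{Z}} f_{n,i,i-t}^2\,f_{n,j,j-t}^2. \]
Since $\kappa_4 = \mathrm{Var}(\epsilon_0^2) - 2(E\epsilon_0^2)^2 \ge -2$, the condition
\[ \kappa_4 > -2 \tag{3} \]
guarantees (A1) in view of
\[ \mathrm{Var}(X_{n,i}X_{n,j}) \ge (1 + \sigma_{n,i,j}^2)(1 + \min\{\kappa_4/2, 0\}) \ge \min\{1, 1 + \kappa_4/2\} > 0. \]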
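The first inequality in the last display can be verified as follows. This is a sketch in shorthand introduced here: $a_t = f_{n,i,i-t}$ and $b_t = f_{n,j,j-t}$, so that $\sum_t a_t^2 = \sum_t b_t^2 = 1$ and $\sigma_{n,i,j} = \sum_t a_t b_t$.

```latex
1 + \sigma_{n,i,j}^2
  = \sum_{s,t} \left( a_s^2 b_t^2 + a_s b_s a_t b_t \right)
  = 2 \sum_t a_t^2 b_t^2 + \sum_{s<t} \left( a_s b_t + a_t b_s \right)^2
  \quad\Longrightarrow\quad
  \sum_t a_t^2 b_t^2 \le \tfrac{1}{2}\left( 1 + \sigma_{n,i,j}^2 \right).
```

Hence for $\kappa_4 < 0$ the term $\kappa_4\sum_t a_t^2 b_t^2$ is at least $(\kappa_4/2)(1 + \sigma_{n,i,j}^2)$, while for $\kappa_4 \ge 0$ it is nonnegative; both cases give the stated bound.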

To ensure the validity of (A2), it is natural to assume that no pairs $X_{n,i}$ and $X_{n,j}$ are strongly correlated, i.e.
\[ \limsup_{n\to\infty}\ \sup_{1\le i<j\le m} \left|\sum_{t\in\mathbb{Z}} f_{n,i,i-t}\,f_{n,j,j-t}\right| < 1. \tag{4} \]
We need the following lemma, whose proof is elementary and will be given in the supplementary file.

Lemma 4. The condition (4) suffices for (A2) if $\epsilon_i$'s are i.i.d. $N(0,1)$.

As an immediate consequence, when $\epsilon_i$'s are i.i.d. $N(0,1)$, we have
\[ \ell := \liminf_{n\to\infty}\ \inf_{\rho}\ \inf_{*}\ \mathrm{Var}(X_{n,i}X_{n,j} - \rho X_{n,k}X_{n,l}) > 0, \]
where $\inf_*$ is taken over all $1 \le i, j, k, l \le m$ such that $i < j$, $k < l$ and $(i,j) \ne (k,l)$. Observe that when $\epsilon_i$'s are i.i.d. $N(0,1)$,
\[ \mathrm{Var}(X_{n,i}X_{n,j} - \rho X_{n,k}X_{n,l}) = 2\sum_{t\in\mathbb{Z}} (f_{n,i,i-t}f_{n,j,j-t} - \rho f_{n,k,k-t}f_{n,l,l-t})^2 + \sum_{s<t} (f_{n,i,i-t}f_{n,j,j-s} + f_{n,i,i-s}f_{n,j,j-t} - \rho f_{n,k,k-t}f_{n,l,l-s} - \rho f_{n,k,k-s}f_{n,l,l-t})^2, \tag{5} \]
and when $\epsilon_i$'s are arbitrary variables, the variance is given by the same formula with the number 2 in (5) being replaced by $2 + \kappa_4$. Therefore, if (3) holds, then
\[ \liminf_{n\to\infty}\ \inf_{\rho}\ \inf_{*}\ \mathrm{Var}(X_{n,i}X_{n,j} - \rho X_{n,k}X_{n,l}) \ge \min\{1, 1 + \kappa_4/2\}\cdot\ell > 0, \]
which implies (A2) holds. To summarize, we have shown that (3) and (4) suffice for (A2).

Now we turn to Conditions (A3) and (A3'). Set
\[ h_n(k) = \sup_{1\le i\le m} \left(\sum_{|t| = \lfloor k/2\rfloor}^{\infty} f_{n,i,t}^2\right)^{1/2}, \]
where $\lfloor x\rfloor = \max\{y \in \mathbb{Z} : y \le x\}$ for any $x \in \mathbb{R}$, then we have
\[ |\sigma_{n,i,j}| \le 2h_n(0)h_n(|i-j|) = 2h_n(|i-j|). \]
Fixing a subset $\{i,j\}$, for any integer $b > 0$, there are at most $8b^2$ subsets $\{k,l\}$ such that $\{k,l\} \subset B(i;b)\cup B(j;b)$, where $B(x;r)$ is the open ball $\{y : |x-y| < r\}$. For all other subsets $\{k,l\}$, we have
\[ |\mathrm{Cov}(X_{n,i}X_{n,j}, X_{n,k}X_{n,l})| \le (4 + 2\kappa_4)h_n(b), \]
and hence (A3) holds if we assume $h_n(k_n)\log k_n = o(1)$ for any positive sequence $(k_n)$ such that $k_n \to \infty$. (A3') holds if we assume
\[ \sum_{k=1}^{m} [h_n(k)]^2 = O(m^{1-\delta}) \]
for some $\delta > 0$, because
\[ |\mathrm{Cov}(X_{n,i}X_{n,j}, X_{n,k}X_{n,l})| \le 2\kappa_4 h_n(|i-j|) + 2h_n(|i-k|) + 2h_n(|i-l|). \]
4. Testing for covariance structures

The asymptotic distribution given in Theorem 2 has several statistical applications. One of them is in high dimensional covariance matrix regularization, because Theorem 2 implies a uniform convergence rate for all sample covariances. Recently, Cai and Liu (2011) explored this direction, and proposed a thresholding procedure for sparse covariance matrix estimation, which is adaptive to the variability of each individual entry. Their method is superior to the uniform thresholding approach studied by Bickel and Levina (2008b).

Testing structures of covariance matrices is also a very important statistical problem. As mentioned in the introduction, when the data dimension is high, conventional tests often cannot be implemented or do not work well. Let $\Sigma_n$ and $R_n$ be the covariance matrix and correlation matrix of the random vector $(X_{n,1,i})_{1\le i\le m}$ respectively. Two types of tests have been studied under the large $n$, large $m$ paradigm. Chen et al. (2010), Bai et al. (2009), Ledoit and Wolf (2002) and Johnstone (2001) considered the test
\[ H_0 : \Sigma_n = I_m \tag{6} \]
and Liu et al. (2008), Schott (2005), Srivastava (2005) and Jiang (2004) studied the problem of testing for complete independence
\[ H_0 : R_n = I_m. \tag{7} \]
Their testing procedures are all based on the critical assumption that the entries of the data matrix $X_n$ are i.i.d., while the hypotheses themselves only require the entries of $(X_{n,1,i})_{1\le i\le m}$ to be uncorrelated. Evidently, we can use $M_n$ in (2) to test (7), and we only require the uncorrelatedness for the validity of the limiting distribution established in Theorem 2, as long as the mild conditions of the theorem are satisfied. On the other hand, we can also take the sample variances into consideration, and use the following test statistic
\[ M_n' = \max_{1\le i\le j\le m} \frac{|\hat\sigma_{n,i,j} - \sigma_{n,i,j}|}{\sqrt{\hat\tau_{n,i,j}}} \]
to test the identity hypothesis (6), where $\sigma_{n,i,j} = I\{i = j\}$. It is not difficult to verify that $M_n'$ has the same asymptotic distribution as $M_n$ under the same conditions with the only difference being that we now have to take sample variances into account as well, namely, the index set $\mathcal{I}_n$ in Section 2 is redefined as $\mathcal{I}_n = \{(i,j) : 1 \le i \le j \le m\}$. Clearly, we can also use $M_n'$ to test $H_0 : \Sigma_n = \Sigma^0$ for some known covariance matrix $\Sigma^0$.

By checking the proof of Theorem 2, it can be seen that if instead of taking the maximum over the set $\mathcal{I}_n = \{(i,j) : 1 \le i < j \le m\}$, we only take the maximum over some subset $A_n \subset \mathcal{I}_n$ whose cardinality $|A_n|$ converges to infinity, then the maximum also has the Gumbel type convergence with normalization constants which are functions of the cardinality of the set $A_n$. Based on this observation, we are able to consider three more testing problems.
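The dependence of the normalization on the cardinality can be made concrete. The sketch below assumes that the centring for a subset of size $s = |A_n|$ takes the form $2\log s - \log\log s - \log\pi$ that appears in the limit (11) of Lemma 6; this reading, and the function names `centring_for_subset`, `centring_all_pairs` and `gap`, are assumptions of the sketch rather than statements from the paper. It also checks numerically that taking all $m(m-1)/2$ pairs approximately recovers the centring $4\log m - \log\log m - \log(8\pi)$ of Theorem 2.

```python
import math


def centring_for_subset(s):
    """Centring 2 log s - log log s - log(pi) for an index set of size s > 1
    (form taken from Lemma 6; an assumption of this sketch)."""
    if s <= 1:
        raise ValueError("s must exceed 1")
    return 2.0 * math.log(s) - math.log(math.log(s)) - math.log(math.pi)


def centring_all_pairs(m):
    """Centring 4 log m - log log m - log(8 pi) appearing in Theorem 2."""
    return 4.0 * math.log(m) - math.log(math.log(m)) - math.log(8.0 * math.pi)


def gap(m):
    """Difference of the two centrings when the subset consists of all pairs i < j."""
    return centring_for_subset(m * (m - 1) // 2) - centring_all_pairs(m)
```

The gap shrinks only at a logarithmic rate in $m$, so the two centrings agree asymptotically but not exactly for moderate $m$.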
4.1. Test for stationarity

Suppose we want to test whether the population is a stationary time series. Under the null hypothesis, each row of the data matrix $X_n$ is distributed as a stationary process $(X_i)_{1\le i\le m}$. Let $\gamma_l = \mathrm{Cov}(X_0, X_l)$ be the autocovariance at lag $l$. In principle, we can use the following test statistic
\[ T_n = \max_{1\le i\le j\le m} \frac{|\hat\sigma_{n,i,j} - \gamma_{i-j}|}{\sqrt{\hat\tau_{n,i,j}}}. \]
The problem is that $\gamma_l$ are unknown. Fortunately, they can not only be estimated, but also be estimated with higher accuracy by
\[ \hat\gamma_{n,l} = \frac{1}{nm}\sum_{k=1}^{n}\sum_{i=|l|+1}^{m} (X_{n,k,i-|l|} - \hat\mu_n)(X_{n,k,i} - \hat\mu_n), \]
where $\hat\mu_n = (1/nm)\sum_{k=1}^{n}\sum_{i=1}^{m} X_{n,k,i}$, and we are led to the test statistic
\[ \tilde T_n = \max_{1\le i\le j\le m} \frac{|\hat\sigma_{n,i,j} - \hat\gamma_{n,i-j}|}{\sqrt{\hat\tau_{n,i,j}}}. \]
Using arguments similar to those of Theorem 2 of Xiao and Wu (2011), under suitable conditions, we have
\[ \max_{0\le l\le m-1} |\hat\gamma_{n,l} - \gamma_l| = O_P\left(\sqrt{\log m/(nm)}\right). \]
Therefore, the limiting distribution for $M_n$ in Theorem 2 also holds for $\tilde T_n$.
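The pooled autocovariance estimate can be sketched in a few lines. The code below follows the displayed formula, normalising by $nm$ and centring by the grand mean; the function name `pooled_autocovariance` and the list-of-rows input layout are assumptions made for illustration.

```python
def pooled_autocovariance(rows, lag):
    """Estimate the lag-|lag| autocovariance by pooling over all n rows and all positions.

    rows : list of n observations, each a list of m numbers (hypothetical input layout)
    """
    n, m = len(rows), len(rows[0])
    lag = abs(lag)
    grand_mean = sum(sum(r) for r in rows) / (n * m)
    total = 0.0
    for r in rows:
        # 0-based positions lag, ..., m-1 correspond to i = |l|+1, ..., m in the text
        for i in range(lag, m):
            total += (r[i - lag] - grand_mean) * (r[i] - grand_mean)
    return total / (n * m)
```

Pooling over the $m$ positions as well as the $n$ rows is what yields the faster rate of order $\sqrt{\log m/(nm)}$ quoted above.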
4.2. Test for bandedness

In time series and longitudinal data analysis, it can be of interest to test whether $\Sigma_n$ has the banded structure. The hypothesis to be tested is
\[ H_0 : \sigma_{n,i,j} = 0 \text{ if } |i-j| > B, \tag{8} \]
where $B = B_n$ may depend on $n$. Cai and Jiang (2011) studied this problem under the assumption that each row of the data matrix $X_n$ is a Gaussian random vector. They proposed to use the maximum sample correlation outside the band
\[ T_n = \max_{|i-j|>B_n} \frac{|\hat\sigma_{n,i,j}|}{\sqrt{\hat\sigma_{n,i,i}\hat\sigma_{n,j,j}}} \]
as the test statistic, and proved that $T_n$ also has the Gumbel type convergence provided that $B_n = o(m)$ and several other technical conditions hold.

Our Theorem 2 can also be employed to test (8). If all the conditions of the theorem are satisfied, the test statistic
\[ T_n' = \max_{|i-j|>B_n} \frac{|\hat\sigma_{n,i,j}|}{\sqrt{\hat\tau_{n,i,j}}} \]
has the same asymptotic distribution as $M_n$ as long as $B_n = o(m)$. Our theory does not need the normality assumption.
4.3. Assess the tapering procedure

Banding and tapering are commonly used regularization procedures in high dimensional covariance matrix estimation. Convergence rates were first obtained by Bickel and Levina (2008a), and later on improved by Cai et al. (2010). Let us introduce a weaker version of the latter result. Suppose each row of $X_n$ is distributed as the random vector $X = (X_i)_{1\le i\le m}$ with mean $\mu$ and covariance matrix $\Sigma = (\sigma_{ij})$. Let $K_0$, $K$ and $t$ be positive constants, and $\mathcal{U}_\eta(K_0, K, t)$ be the class of $m$-dimensional distributions which satisfy the following conditions
\[ \max_{|i-j|=k} |\sigma_{ij}| \le Kk^{-(1+\eta)} \text{ for all } k; \tag{9} \]
\[ \lambda_{\max}(\Sigma) \le K_0; \]
\[ P\left[|v^T(X - \mu)| > x\right] \le e^{-tx^2/2} \text{ for all } x > 0 \text{ and } \|v\| = 1; \]
where $\lambda_{\max}(\Sigma)$ is the largest eigenvalue of $\Sigma$. For a given even integer $1 \le B_n \le m$, define the tapered estimate of the covariance matrix $\Sigma$
\[ \hat\Sigma_{n,B_n} = (w_{ij}\hat\sigma_{n,i,j})_{1\le i,j\le m}, \]
where the weights correspond to a flat top kernel and are given by
\[ w_{ij} = \begin{cases} 1, & \text{when } |i-j| \le B_n/2, \\ 2 - 2|i-j|/B_n, & \text{when } B_n/2 < |i-j| \le B_n, \\ 0, & \text{otherwise.} \end{cases} \]
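A minimal sketch of the tapering step follows. It assumes the standard trapezoidal (flat top) weights: one inside the half band, decreasing linearly to zero at the band edge, and zero beyond. The function names `flat_top_weight` and `taper`, and the nested-list representation of the sample covariance matrix, are hypothetical.

```python
def flat_top_weight(i, j, band):
    """Flat-top weight: 1 for |i-j| <= band/2, linear decay on (band/2, band], 0 beyond.
    (Trapezoidal form assumed for this sketch.)"""
    d = abs(i - j)
    if d <= band / 2:
        return 1.0
    if d <= band:
        return 2.0 - 2.0 * d / band
    return 0.0


def taper(sample_cov, band):
    """Entrywise product of a sample covariance matrix (list of lists) with the weights."""
    m = len(sample_cov)
    return [[flat_top_weight(i, j, band) * sample_cov[i][j] for j in range(m)]
            for i in range(m)]
```

For example, with `band = 4` the entries at distance 3 from the diagonal are halved and those at distance 4 or more are set to zero.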

Theorem 5 (Cai et al., 2010). If $m \ge n^{1/(2\eta+1)}$, $\log m = o(n)$ and $B_n = n^{1/(2\eta+1)}$, then there exists a constant $C > 0$ such that



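\[ \sup_{\mathcal{U}_\eta} E\,\lambda_{\max}^2\left(\hat\Sigma_{n,B_n} - \Sigma\right) \le Cn^{-2\eta/(2\eta+1)} + C\,\frac{\log m}{n}. \]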

We see that it is the parameter $\eta$ that decides the convergence rate under the operator norm. After such a tapering procedure has been applied, it is important to ask whether it is appropriate, and in particular, whether (9) is satisfied. We propose to use
\[ T_n = \max_{|i-j|>B_n} \frac{|\hat\sigma_{n,i,j}|}{\sqrt{\hat\tau_{n,i,j}}} \]
as the test statistic. According to the observation made at the beginning of Section 4, if the conditions of Theorem 2 are satisfied, then
\[ T_n' = \max_{|i-j|>B_n} \frac{|\hat\sigma_{n,i,j} - \sigma_{n,i,j}|}{\sqrt{\hat\tau_{n,i,j}}} \]
has the same limiting law as $M_n$. On the other hand, (9) implies that
\[ \max_{|i-j|>B_n} |\sigma_{n,i,j}| = O\left(n^{-(1+\eta)/(2\eta+1)}\right), \]
so $T_n$ has the same limiting distribution as $T_n'$ if we further assume $\log m = o(n^{2/(4\eta+2)})$.
5. Proof

The proofs of Theorem 2 under (A4) and (A4') are very similar, and they share a common Poisson approximation step, which we will formulate in Section 5.1 under a more general context, where the limiting distribution of the maximum of sample means is obtained. Since the proof under (A4') is more involved, we provide the detailed proof under this assumption in Section 5.2 and point out in Section 5.3 how it can be adapted to give a proof under (A4).



5.1. Maximum of Sample Means: An Intermediate Step 

In this section we provide a general result on the maximum of sample means. Let $Y_n = (Y_{n,k,i})_{1\le k\le n,\,i\in\mathcal{I}_n}$ be a data matrix whose $n$ rows are independent and identically distributed, and whose entries have mean zero and variance one, where $\mathcal{I}_n$ is an index set with cardinality $|\mathcal{I}_n| = s_n$. For each $i \in \mathcal{I}_n$, let $y_i$ be the $i$-th column of $Y_n$, and $\bar y_i = (1/n)\sum_{k=1}^{n} Y_{n,k,i}$. Define
\[ W_n = \max_{i\in\mathcal{I}_n} |\bar y_i|. \tag{10} \]
Let $\Sigma_n$ be the covariance matrix of the $s_n$-dimensional random vector $(Y_{n,1,i})_{i\in\mathcal{I}_n}$.

Lemma 6. Assume $\Sigma_n$ satisfies either (B1) or (B2) of Section 6.1 and $\log s_n = o(n^{1/3})$. Suppose there is a constant $C > 0$ such that $Y_{n,k,i} \in \mathcal{B}(1, C\tau_n)$ for each $1 \le k \le n$, $i \in \mathcal{I}_n$, with
\[ \tau_n = \frac{\sqrt{n}\,\delta_n}{(\log s_n)^{3/2}}, \]
where $(\delta_n)$ is a sequence of positive numbers such that $\delta_n = o(1)$ and $(\log s_n)^3/n = o(\delta_n)$, and the definition of the collection $\mathcal{B}(d,\tau)$ is given in (27). Then
\[ \lim_{n\to\infty} P\left(nW_n^2 - 2\log s_n + \log(\log s_n) + \log\pi \le z\right) = \exp\left(-e^{-z/2}\right). \tag{11} \]

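Proof. For each $z \in \mathbb{R}$, let $z_n = a_{2s_n}z/2 + b_{2s_n}$. Let $(Z_{n,i})_{i\in\mathcal{I}_n}$ be a mean zero normal random vector with covariance matrix $\Sigma_n$. For any subset $A = \{i_1, i_2, \ldots, i_d\} \subset \mathcal{I}_n$, let $y_A = \sqrt{n}(\bar y_{i_1}, \bar y_{i_2}, \ldots, \bar y_{i_d})^T$ and $Z_A = (Z_{n,i_1}, Z_{n,i_2}, \ldots, Z_{n,i_d})^T$. By Lemma 8, we have for $\theta_n = \delta_n^{1/2}/\sqrt{\log s_n}$ that
\[ P(|y_A|_\bullet \ge z_n) \le P(|Z_A|_\bullet \ge z_n - \theta_n) + C_d\exp\left\{-\frac{\theta_n}{C_d\,\delta_n(\log s_n)^{-3/2}}\right\} \le P(|Z_A|_\bullet \ge z_n - \theta_n) + C_d\exp\left\{-(\log s_n)\delta_n^{-1/2}\right\}. \]
Therefore,
\[ \sum_{A\subset\mathcal{I}_n,\,|A|=d} P(|y_A|_\bullet \ge z_n) \le \sum_{A\subset\mathcal{I}_n,\,|A|=d} P(|Z_A|_\bullet \ge z_n - \theta_n) + C_d\,s_n^d\exp\left\{-(\log s_n)\delta_n^{-1/2}\right\}. \]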

H. Xiao, W.B. Wu/ Simultaneous Inference of Covariances 






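Similarly, we have
\[ \sum_{A\subset\mathcal{I}_n,\,|A|=d} P(|y_A|_\bullet \ge z_n) \ge \sum_{A\subset\mathcal{I}_n,\,|A|=d} P(|Z_A|_\bullet \ge z_n + \theta_n) - C_d\,s_n^d\exp\left\{-(\log s_n)\delta_n^{-1/2}\right\}. \]
Since $(z_n \pm \theta_n)^2 = 2\log s_n - \log(\log s_n) - \log\pi + z + o(1)$, by Lemma 7 we know
\[ \lim_{n\to\infty} \sum_{A\subset\mathcal{I}_n,\,|A|=d} P(|Z_A|_\bullet \ge z_n \pm \theta_n) = \frac{e^{-dz/2}}{d!}, \]
and hence
\[ \lim_{n\to\infty} \sum_{A\subset\mathcal{I}_n,\,|A|=d} P(|y_A|_\bullet \ge z_n) = \frac{e^{-dz/2}}{d!}. \]
The proof is complete in view of Lemma 9. □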
5.2. Proof under (A4')

We divide the proof into three steps. The first one is a truncation step, which will make the Gaussian approximation result Lemma 8 and the Bernstein inequality applicable, so that we can prove Theorem 2 under the assumption that all the involved mean and variance parameters are known. In the next two steps we show that plugging in estimated mean and variance parameters does not change the limiting distribution.

Step 1: Truncation. For notational simplicity we let $q = p/(4 + 2p)$. Define
\[ \tilde X_{n,k,i} = X_{n,k,i}\,I\left\{|X_{n,k,i}| \le n^{1/(4+2p)}\right\}, \tag{12} \]
and define $\tilde M_n$ similarly as $M_n$ with $X_{n,k,i}$ being replaced by its truncated version $\tilde X_{n,k,i}$. Since $\log m = o(n^q)$, we have
\[ P(\tilde M_n \ne M_n) \le \sum_{k=1}^{n}\sum_{i=1}^{m} P\left(|X_{n,k,i}| > n^{1/(4+2p)}\right) \le nm\,\mathcal{K}_n(t,p)\exp\{-tn^q\} = \mathcal{K}_n(t,p)\exp\{-tn^q + \log m + \log n\} = o(1). \]
Therefore, in the rest of the proof, it suffices to consider $\tilde X_{n,k,i}$. For notational simplicity, we still use $\tilde X_{n,k,i}$ to denote its centered version with mean zero.
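Define $\tilde\sigma_{n,i,j} = E(\tilde X_{n,1,i}\tilde X_{n,1,j})$, and $\tilde\tau_{n,i,j} = \mathrm{Var}(\tilde X_{n,1,i}\tilde X_{n,1,j})$. Set
\[ M_{n,1} = \max_{1\le i<j\le m} \frac{1}{\sqrt{\tilde\tau_{n,i,j}}}\left|\frac{1}{n}\sum_{k=1}^{n} \tilde X_{n,k,i}\tilde X_{n,k,j} - \tilde\sigma_{n,i,j}\right|; \]
\[ M_{n,2} = \max_{1\le i<j\le m} \frac{1}{\sqrt{\tilde\tau_{n,i,j}}}\left|\frac{1}{n}\sum_{k=1}^{n} \tilde X_{n,k,i}\tilde X_{n,k,j} - \sigma_{n,i,j}\right|. \]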
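Elementary calculations show that
\[ \max_{1\le i\le j\le m} |\tilde\sigma_{n,i,j} - \sigma_{n,i,j}| \le C\exp\{-tn^q/2\}, \text{ and} \tag{13} \]
\[ \max_{\alpha,\beta\in\mathcal{I}_n} \left|\mathrm{Cov}(\tilde X_{n,\alpha}, \tilde X_{n,\beta}) - \mathrm{Cov}(X_{n,\alpha}, X_{n,\beta})\right| \le C\exp\{-tn^q/2\}. \tag{14} \]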

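By (14), we know the covariance matrix of $(\tilde X_{n,\alpha})_{\alpha\in\mathcal{I}_n}$ satisfies either (B1) or (B2) if $\Sigma_n$ satisfies (B1) or (B2) correspondingly. On the other hand, we have by elementary calculation that there exists a constant $C_p > 0$ such that
\[ \limsup_{n\to\infty}\ \max_{\alpha\in\mathcal{I}_n} E\exp\left\{C_p t|\tilde X_{n,\alpha}|^{p/2}\right\} < \infty. \]
It follows that when $0 < p < 2$, for each integer $r \ge 3$,
\[ E|\tilde X_{n,\alpha}|^r \le \left(4n^{2/(4+2p)}\right)^{r(1-p/2)} r!\,(C_p t)^{-r}\,E\exp\left\{C_p t|\tilde X_{n,\alpha}|^{p/2}\right\}. \]
Therefore,
\[ E_0\tilde X_{n,\alpha} \in \mathcal{B}\left(1, Cn^{(2-p)/(4+2p)}\right). \]
When $2 \le p \le 4$, it is easily seen that $E_0\tilde X_{n,\alpha} \in \mathcal{B}(1, C)$. Since $\log m = o(n^q)$, we know all the conditions of Lemma 6 are satisfied, and hence
\[ \lim_{n\to\infty} P\left(nM_{n,1}^2 - 4\log m + \log(\log m) + \log(8\pi) \le y\right) = \exp\left\{-e^{-y/2}\right\}. \tag{15} \]
Combining (13) and (14), we know the preceding equation (15) also holds with $M_{n,1}$ being replaced by $M_{n,2}$.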



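Step 2: Effect of Estimated Means. Set $\bar X_{n,i} = (1/n)\sum_{k=1}^{n} \tilde X_{n,k,i}$. Define
\[ M_{n,3} = \max_{1\le i<j\le m} \frac{1}{\sqrt{\tilde\tau_{n,i,j}}}\left|\frac{1}{n}\sum_{k=1}^{n} (\tilde X_{n,k,i} - \bar X_{n,i})(\tilde X_{n,k,j} - \bar X_{n,j}) - \sigma_{n,i,j}\right|. \]
In this step we show that (15) also holds for $M_{n,3}$. Observe that
\[ |M_{n,3} - M_{n,2}| \le \max_{1\le i<j\le m} \frac{|\bar X_{n,i}\bar X_{n,j}|}{\sqrt{\tilde\tau_{n,i,j}}} \le \max_{1\le i\le m} |\bar X_{n,i}|^2\cdot\left(\min_{1\le i<j\le m} \tilde\tau_{n,i,j}\right)^{-1/2}. \]
Since each $\tilde X_{n,k,i}$ is bounded by $2n^{1/(4+2p)}$, by Bernstein's inequality we have for any constant $K > 0$,
\[ \max_{1\le i\le m} P\left(|\bar X_{n,i}| \ge K\sqrt{\frac{\log m}{n}}\right) \le C\exp\left\{-\frac{K^2 n\log m}{Cn + 2K\sqrt{n\log m}\cdot 2n^{1/(4+2p)}}\right\}, \]
and hence
\[ \max_{1\le i\le m} |\bar X_{n,i}| = O_P\left(\sqrt{\frac{\log m}{n}}\right), \tag{16} \]
which together with (14) implies that
\[ |M_{n,3} - M_{n,2}| = O_P\left(\frac{\log m}{n}\right) = o_P\left(\frac{1}{\sqrt{n\log m}}\right). \]
Therefore, (15) also holds for $M_{n,3}$.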

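Step 3: Effect of Estimated Variances. Denote by $\hat\sigma_{n,i,j}$ the estimate of $\tilde\sigma_{n,i,j}$,
\[ \hat\sigma_{n,i,j} = \frac{1}{n}\sum_{k=1}^{n} (\tilde X_{n,k,i} - \bar X_{n,i})(\tilde X_{n,k,j} - \bar X_{n,j}). \]
In the definition of $\tilde M_n$, $\tilde\tau_{n,i,j}$ is unknown, and is estimated by
\[ \hat\tau_{n,i,j} = \frac{1}{n}\sum_{k=1}^{n} \left[(\tilde X_{n,k,i} - \bar X_{n,i})(\tilde X_{n,k,j} - \bar X_{n,j}) - \hat\sigma_{n,i,j}\right]^2. \]
In this step we show that (15) holds for $\tilde M_n$. Since
\[ \left|n\tilde M_n^2 - nM_{n,3}^2\right| \le nM_{n,3}^2\cdot\max_{1\le i<j\le m} \left|1 - \tilde\tau_{n,i,j}/\hat\tau_{n,i,j}\right|, \]
it suffices to show that
\[ \max_{1\le i<j\le m} |\hat\tau_{n,i,j} - \tilde\tau_{n,i,j}| = o_P(1/\log m). \tag{17} \]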

l<z<j<m 

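Set
\[ \hat\tau_{n,i,j,1} = \frac{1}{n}\sum_{k=1}^{n} \left[(\tilde X_{n,k,i} - \bar X_{n,i})(\tilde X_{n,k,j} - \bar X_{n,j}) - \tilde\sigma_{n,i,j}\right]^2, \]
\[ \hat\tau_{n,i,j,2} = \frac{1}{n}\sum_{k=1}^{n} \left(\tilde X_{n,k,i}\tilde X_{n,k,j} - \tilde\sigma_{n,i,j}\right)^2. \]
Observe that
\[ \hat\tau_{n,i,j,1} - \hat\tau_{n,i,j} = (\hat\sigma_{n,i,j} - \tilde\sigma_{n,i,j})^2, \]
which together with (15) implies that
\[ \max_{1\le i<j\le m} |\hat\tau_{n,i,j,1} - \hat\tau_{n,i,j}| = O_P(\log m/n). \tag{18} \]
Note that $\tilde X_{n,k,i}\tilde X_{n,k,j}$ are uniformly bounded according to the truncation (12), so by Bernstein's inequality, we have
\[ \max_{1\le i<j\le m} P\left(|\hat\tau_{n,i,j,2} - \tilde\tau_{n,i,j}| \ge n^{-q}\right) \le \exp(-n^q/100), \]
and it follows that
\[ \max_{1\le i<j\le m} |\hat\tau_{n,i,j,2} - \tilde\tau_{n,i,j}| = O_P(n^{-q}). \tag{19} \]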

l<?<j<m 

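In view of (18), (19), and the assumption $\log m = o(n^q)$, we know to show (17), it remains to prove
\[ \max_{1\le i<j\le m} |\hat\tau_{n,i,j,1} - \hat\tau_{n,i,j,2}| = o_P(1/\log m). \tag{20} \]
Elementary calculations show that
\[ \max_{1\le i<j\le m} |\hat\tau_{n,i,j,1} - \hat\tau_{n,i,j,2}| \le 4h_{n,1}^2h_{n,2} + 3h_{n,1}^4 + 4h_{n,2}^{1/2}h_{n,4}^{1/2}h_{n,1} + 2h_{n,3}h_{n,1}^2, \]
where
\[ h_{n,1} = \max_{1\le i\le m} |\bar X_{n,i}|, \qquad h_{n,2} = \max_{1\le i\le m} \frac{1}{n}\sum_{k=1}^{n} \tilde X_{n,k,i}^2, \]
\[ h_{n,3} = \max_{1\le i\le j\le m} \left|\frac{1}{n}\sum_{k=1}^{n} \tilde X_{n,k,i}\tilde X_{n,k,j} - \tilde\sigma_{n,i,j}\right|, \qquad h_{n,4} = \max_{1\le i\le j\le m} \hat\tau_{n,i,j,2}. \]
By (16), we know $h_{n,1} = O_P(\sqrt{\log m/n})$. By (19) we have $h_{n,4} = O_P(1)$. Combining (12) and Bernstein's inequality, we can show that
\[ h_{n,3} = O_P\left(\sqrt{\log m/n}\right). \]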
As an immediate consequence, we know $h_{n,2} = O_P(1)$. Therefore,
\[ \max_{1\le i<j\le m} |\hat\tau_{n,i,j,1} - \hat\tau_{n,i,j,2}| = O_P\left(\sqrt{\log m/n}\right), \]
and (20) holds by using the assumption $\log m = o(n^q) = o(n^{1/3})$. The proof of Theorem 2 under (A4') is now complete.

5.3. Proof under (A4) 

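We follow the proof in Section 5.2 and point out necessary modifications to make it work under (A4). If not specified, all the notations have the same definitions as in Section 5.2. For notational simplicity, we let $p = 4(1 + q) + \delta$.

Step 1: Truncation. We truncate $X_{n,k,i}$ by
\[ \tilde X_{n,k,i} = X_{n,k,i}\,I\left\{|X_{n,k,i}| \le n^{1/4}/\log n\right\}, \]
then
\[ P(\tilde M_n \ne M_n) \le nm\,\mathcal{M}_n(p)\,n^{-p/4}(\log n)^p \le C\mathcal{M}_n(p)\,n^{-\delta/4}(\log n)^p = o(1). \]
Therefore, in the rest of the proof, it suffices to consider $\tilde X_{n,k,i}$. For notational simplicity, we still use $\tilde X_{n,k,i}$ to denote its centered version with mean zero. Elementary calculations show that
\[ \max_{1\le i\le j\le m} |\tilde\sigma_{n,i,j} - \sigma_{n,i,j}| \le Cn^{-(p-2)/4}(\log n)^{p-2}, \text{ and} \tag{21} \]
\[ \max_{\alpha,\beta\in\mathcal{I}_n} \left|\mathrm{Cov}(\tilde X_{n,\alpha}, \tilde X_{n,\beta}) - \mathrm{Cov}(X_{n,\alpha}, X_{n,\beta})\right| \le Cn^{-(p-4)/4}(\log n)^{p-4}. \tag{22} \]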

By (22), we know the covariance matrix of $(\tilde X_{n,\alpha})_{\alpha\in\mathcal{I}_n}$ satisfies either (B1) or (B2) if $\Sigma_n$ satisfies (B1) or (B2) correspondingly. Since
\[ E_0\tilde X_{n,\alpha} \in \mathcal{B}\left[1, 8\sqrt{n}/(\log n)^2\right], \]
we know all the conditions of Lemma 6 are satisfied, and hence (15) holds for $M_{n,1}$. Combining (21) and (22), we know (15) also holds if we replace $M_{n,1}$ by $M_{n,2}$.

Step 2: Effect of Estimated Means. Using Bernstein's inequality, we can show
\[ \max_{1\le i\le m} |\bar X_{n,i}| = O_P\left(\sqrt{\frac{\log n}{n}}\right), \]
which implies that
\[ |M_{n,3} - M_{n,2}| = O_P\left(\frac{\log n}{n}\right), \]
and hence (15) also holds for $M_{n,3}$.
Step 3: Effect of Estimated Variances. It suffices to show that
\[ \max_{1\le i<j\le m} |\hat\tau_{n,i,j} - \tilde\tau_{n,i,j}| = o_P(1/\log n). \tag{23} \]
Using (15), we know
\[ \max_{1\le i<j\le m} |\hat\tau_{n,i,j,1} - \hat\tau_{n,i,j}| = O_P(\log n/n). \tag{24} \]

Since 



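By Corollary 1.6 of Nagaev (1979) (with $x = n/(\log n)^2$ and $y = n/[2(\log n)^3]$ in their inequality (1.22)), we have
\[ \max_{1\le i<j\le m} P\left(|\hat\tau_{n,i,j,2} - \tilde\tau_{n,i,j}| \ge (\log n)^{-2}\right) \le \left(\frac{Cn}{n(\log n)^{-2}\cdot[n(\log n)^{-3}/2]^{q\wedge 1}}\right)^{\log n} \le \left(\frac{C(\log n)^5}{n^{q\wedge 1}}\right)^{\log n}, \]
and it follows that
\[ \max_{1\le i<j\le m} |\hat\tau_{n,i,j,2} - \tilde\tau_{n,i,j}| = O_P\left[(\log n)^{-2}\right]. \tag{25} \]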



In view of (24), (25), we know to show (23), it remains to prove
\[ \max_{1\le i<j\le m} |\hat\tau_{n,i,j,1} - \hat\tau_{n,i,j,2}| = o_P(1/\log n). \tag{26} \]
We know $h_{n,1} = O_P(\sqrt{\log n/n})$ and $h_{n,4} = O_P(1)$. Using Bernstein's inequality, we can show that
\[ h_{n,3} = O_P\left(\sqrt{\log n/n}\right), \]
and it follows that $h_{n,2} = O_P(1)$. Therefore,
\[ \max_{1\le i<j\le m} |\hat\tau_{n,i,j,1} - \hat\tau_{n,i,j,2}| = O_P\left(\sqrt{\log n/n}\right), \]
and (26) holds. The proof of Theorem 2 under (A4) is now complete.



6. Some auxiliary results 

In this section we provide a normal comparison principle, a Gaussian approximation result, and a Poisson convergence theorem.
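6.1. A normal comparison principle

Suppose for each $n \ge 1$, $(X_{n,i})_{i\in\mathcal{I}_n}$ is a Gaussian random vector whose entries have mean zero and variance one, where $\mathcal{I}_n$ is an index set with cardinality $|\mathcal{I}_n| = s_n$. Let $\Sigma_n = (r_{n,i,j})_{i,j\in\mathcal{I}_n}$ be the covariance matrix of $(X_{n,i})_{i\in\mathcal{I}_n}$. Assume that $s_n \to \infty$ as $n \to \infty$.

We impose either of the following two conditions.

(B1) For any sequence $(b_n)$ such that $b_n \to \infty$, $\gamma(n, b_n) = o(1/\log b_n)$; and $\limsup_{n\to\infty} \gamma_n < 1$.

(B2) For any sequence $(b_n)$ such that $b_n \to \infty$, $\gamma(n, b_n) = o(1)$; $\sum_{i\ne j\in\mathcal{I}_n} r_{n,i,j}^2 = O(s_n^{2-\delta})$ for some $\delta > 0$; and $\limsup_{n\to\infty} \gamma_n < 1$.

Here
\[ \gamma(n, b_n) := \sup_{i\in\mathcal{I}_n}\ \sup_{A\subset\mathcal{I}_n,\,|A|=b_n}\ \inf_{j\in A} |r_{n,i,j}| \quad\text{and}\quad \gamma_n := \sup_{i,j\in\mathcal{I}_n;\ i\ne j} |r_{n,i,j}|. \]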


Lemma 7. Assume either (B1) or (B2). For a positive real number $z_n$, define
\[ A_{n,i} = \{|X_{n,i}| > z_n\} \quad\text{and}\quad Q_{n,d} = \sum_{A\subset\mathcal{I}_n,\,|A|=d} P\left(\bigcap_{i\in A} A_{n,i}\right). \]
If $z_n$ satisfies that $z_n^2 = 2\log s_n - \log\log s_n - \log\pi + 2z + o(1)$, then for all $d \ge 1$,
\[ \lim_{n\to\infty} Q_{n,d} = \frac{e^{-dz}}{d!}. \]
Lemma 7 is a refined version of Lemma 20 in Xiao and Wu (2011), so we omit the proof and put the details in a supplementary file.

Remark 2. The conditions imposed on $\gamma(n, b_n)$ seem a little involved. We have the following equivalent versions. Define
\[ G_n(t) = \max_{i\in\mathcal{I}_n} \sum_{j\in\mathcal{I}_n} I\{|r_{n,i,j}| > t\}. \]
Then (i) $\gamma(n, b_n) = o(1)$ for any sequence $b_n \to \infty$ if and only if the sequence $[G_n(t)]_{n\ge 1}$ is bounded for all $t > 0$; and (ii) $\gamma(n, b_n)(\log b_n) = o(1)$ for any sequence $b_n \to \infty$ if and only if $G_n(t_n) = \exp\{o(1/t_n)\}$ for any positive sequence $(t_n)$ converging to zero.
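For a given correlation matrix the quantity $G_n(t)$ is straightforward to evaluate. The sketch below counts, for each row, the off-diagonal entries whose absolute value exceeds $t$ and returns the largest count. Excluding the diagonal is a choice made here (it changes the count by at most one and does not affect boundedness); the function name `max_large_correlation_count` is hypothetical.

```python
def max_large_correlation_count(corr, t):
    """Largest number of off-diagonal entries in any row of corr with |r_ij| > t.

    corr : square correlation matrix as a list of lists (hypothetical input layout)
    """
    s = len(corr)
    return max(
        sum(1 for j in range(s) if j != i and abs(corr[i][j]) > t)
        for i in range(s)
    )
```

Condition (i) of the remark then says that, for each fixed $t > 0$, this count stays bounded as $n$ grows.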
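6.2. A Gaussian approximation result

For a positive integer $d$, let $\mathfrak{B}_d$ be the Borel $\sigma$-field on the Euclidean space $\mathbb{R}^d$. For two probability measures $P$ and $Q$ on $(\mathbb{R}^d, \mathfrak{B}_d)$ and $\lambda > 0$, define the quantity
\[ \pi(P, Q; \lambda) = \sup_{A\in\mathfrak{B}_d} \left\{\max\left[P(A) - Q(A^\lambda),\ Q(A) - P(A^\lambda)\right]\right\}, \]
where $A^\lambda$ is the $\lambda$-neighborhood of $A$
\[ A^\lambda := \left\{x \in \mathbb{R}^d : \inf_{y\in A} |x - y| < \lambda\right\}. \]
For $\tau > 0$, let $\mathcal{B}(d, \tau)$ be the collection of $d$-dimensional random variables which satisfy the multivariate analogue of Bernstein's condition. Denote by $\langle x, y\rangle$ the inner product of two vectors $x$ and $y$.
\[ \mathcal{B}(d, \tau) = \Big\{\xi \text{ is a random variable}: E\xi = 0, \text{ and } \left|E\left[\langle\xi, t\rangle^2\langle\xi, u\rangle^{m-2}\right]\right| \le \tfrac{1}{2}m!\,\tau^{m-2}\|u\|^{m-2}E\left[\langle\xi, t\rangle^2\right] \tag{27} \]
\[ \text{for every } m = 3, 4, \ldots \text{ and for all } t, u \in \mathbb{R}^d\Big\}. \]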


The following lemma on the Gaussian approximation is taken from Zaitsev 
(1987). 

Lemma 8. Let $\tau > 0$, and $\xi_1, \xi_2, \ldots, \xi_n \in \mathbb{R}^d$ be independent random vectors 
such that $\xi_i \in \mathcal{B}(d, \tau)$ for $i = 1, 2, \ldots, n$. Let $S = \xi_1 + \xi_2 + \cdots + \xi_n$, and $\mathcal{L}(S)$ 
be the induced distribution on $\mathbb{R}^d$. Let $\Phi$ be the Gaussian distribution with 
zero mean and the same covariance matrix as that of $S$. Then for all $\lambda > 0$, 

\[
\pi[\mathcal{L}(S), \Phi; \lambda] \le c_{1,d} \exp\Big( -\frac{\lambda}{c_{2,d}\, \tau} \Big),
\]

where the constants $c_{j,d}$, $j = 1, 2$, may be taken in the form $c_{j,d} = c_j d^{5/2}$. 



6.3. Poisson approximation: moment method 

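Lemma 9. Suppose for each $n \ge 1$, $(A_{n,i})_{i \in \mathcal{I}_n}$ is a finite collection of events. 
Let $I_{A_{n,i}}$ be the indicator function of $A_{n,i}$, and $W_n = \sum_{i \in \mathcal{I}_n} I_{A_{n,i}}$. For each 
$d \ge 1$, define 

\[
Q_{n,d} = \sum_{A \subset \mathcal{I}_n,\, |A| = d} P\Big( \bigcap_{i \in A} A_{n,i} \Big).
\]

Suppose there exists a $\lambda > 0$ such that 

\[
\lim_{n \to \infty} Q_{n,d} = \lambda^d / d! \quad \text{for each } d \ge 1.
\]

Then 

\[
\lim_{n \to \infty} P(W_n = k) = \lambda^k e^{-\lambda} / k! \quad \text{for each } k \ge 0.
\]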
Observe that for each $d \ge 1$, the $d$-th factorial moment of $W_n$ is given by 

\[
E[W_n (W_n - 1) \cdots (W_n - d + 1)] = d! \cdot Q_{n,d},
\]

so Lemma 9 is essentially the moment method. The proof is elementary, and we 
omit details. 
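
The simplest instance of Lemma 9 is a collection of $n$ independent events, each of probability $p = \lambda/n$, for which $Q_{n,d} = \binom{n}{d} p^d$ and $W_n$ is binomial. The sketch below, with illustrative function names and this independence assumption made only for the example, checks the factorial moment identity by direct summation and compares the binomial and Poisson probabilities.

```python
import math

def Q(n, d, p):
    """Q_{n,d} for n independent events of probability p: C(n, d) * p^d."""
    return math.comb(n, d) * p ** d

def binom_pmf(n, k, p):
    """P(W = k) for W ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def poisson_pmf(k, lam):
    """The limiting Poisson probability lam^k e^{-lam} / k!."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def factorial_moment(n, d, p):
    """E[W(W-1)...(W-d+1)] for W ~ Binomial(n, p), summed from the pmf."""
    return sum(math.perm(k, d) * binom_pmf(n, k, p) for k in range(d, n + 1))

if __name__ == "__main__":
    lam, n = 2.0, 10**6
    p = lam / n
    for d in (1, 2, 3):
        print(d, Q(n, d, p), lam ** d / math.factorial(d))
    for k in (0, 1, 2, 3):
        print(k, binom_pmf(n, k, p), poisson_pmf(k, lam))
```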
In this document we give the proofs of Lemma 3, Lemma 4 and Lemma 7 of the main article. 

Proof of Lemma 3. Assume $X_i$ has mean zero and variance one. Let $\gamma_k = E(X_0 X_k)$ be the autocovariance 
of lag $k$. Then by Proposition 8, Eq. (34) of Xiao and Wu (2011), we know 

\[
|\gamma_k| \le \Psi_2 \cdot \Psi_2(|k|). \tag{S.1}
\]



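(i) Since $\Psi_4 < \infty$, we know for any $\eta > 0$, there exists an $N_1 > 0$ such that $|\gamma_k| \le \eta$ when $k > N_1$. For 
$j < k$, define $X_{k,j}$ as the coupled version of $X_k$ in which the innovations $\epsilon_i$, $i \le j$, are replaced by $\epsilon'_i$, where $(\epsilon'_j)_{j \in \mathbb{Z}}$ is an i.i.d. copy of $(\epsilon_j)_{j \in \mathbb{Z}}$. By Eq. (38) of 
Xiao and Wu (2011), we know there exists an $N_2 > 0$ such that when $k > N_2$, $\|X_k - X_{k,0}\|_4 \le \eta$. Set 
$N = \max\{N_1, N_2\}$; when $k > N$, we have 

\[
\begin{aligned}
\operatorname{Var}(X_0 X_k) &= E(X_0^2 X_k^2) - \gamma_k^2 = E(X_0^2 X_{k,0}^2) + E[X_0^2 (X_k^2 - X_{k,0}^2)] - \gamma_k^2 \\
&\ge 1 - \eta^2 - 2 \|X_0\|_4^3 \cdot \eta.
\end{aligned}
\]

Therefore, (A1) holds because $\eta$ can be arbitrarily small. 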
(ii) We need to show that 

\[
\sup_{j > 0,\; 0 \le k < l,\; (0,j) \ne (k,l)} \operatorname{Cor}(X_0 X_j, X_k X_l) < 1.
\]

It suffices to show that for some $N > 0$, 

\[
\sup_{j > 0,\; 0 \le k < l,\; (0,j) \ne (k,l),\; j + k + l > N} \operatorname{Cor}(X_0 X_j, X_k X_l) < 1.
\]

If $j + k + l > N$, then the set $\{0, j, k, l\}$ can be partitioned into two non-empty subsets $B_1$ and $B_2$ whose 
distance is no less than $N/6$. We only consider this type of partition. If there is a partition such that 
one of $B_1$ and $B_2$ has cardinality one, then similarly as in (i), we know for any $\eta > 0$, when $N$ is large 
enough, 

\[
|\operatorname{Cov}(X_0 X_j, X_k X_l)| = |E(X_0 X_j X_k X_l) - \gamma_j \gamma_{l-k}| \le \eta.
\]

If for any partition both $B_1$ and $B_2$ have cardinality two, there are two sub-cases. (a) $j < k < l$ and 
$k - j > N/6$. For any $\eta > 0$, when $N$ is large enough, we have 

\[
|\operatorname{Cov}(X_0 X_j, X_k X_l)| = |E[X_0 X_j (X_k X_l - X_{k,j} X_{l,j})]| \le \eta.
\]

(b) $\min\{j, l\} - k > N/6$. As in (i), for any $\eta > 0$, when $N$ is large enough, we have $\operatorname{Var}(X_0 X_j) \ge 1 - \eta$, 
$\operatorname{Var}(X_k X_l) \ge 1 - \eta$, and $|\gamma_j \gamma_{l-k}| \le \eta$. On the other hand, the condition $\Psi_4 > 0$ guarantees that the 
process is non-deterministic, and hence $\gamma := \sup_{k \ge 1} |\gamma_k| < 1$. It follows that when $N$ is large enough 

\[
\begin{aligned}
|E(X_0 X_j X_k X_l)| &= |E(X_0 X_{j,k} X_k X_{l,k}) + E[X_0 X_k (X_j X_l - X_{j,k} X_{l,k})]| \\
&\le \gamma + \eta.
\end{aligned}
\]

Therefore, 

\[
|\operatorname{Cor}(X_0 X_j, X_k X_l)| \le (\gamma + 2\eta)/(1 - \eta) < 1
\]

when $\eta$ is small enough. The proof of (ii) is now complete. 
(iii) We first consider (A3). Note that 

\[
\operatorname{Cov}(X_i X_j, X_k X_l) = \operatorname{Cum}(X_i, X_j, X_k, X_l) + \gamma_{i-k} \gamma_{j-l} + \gamma_{i-l} \gamma_{j-k},
\]

where $\operatorname{Cum}(X_i, X_j, X_k, X_l)$ is the fourth order joint cumulant of the random vector $(X_i, X_j, X_k, X_l)^T$. 
Fix a subset $\{i, j\}$; for any integer $b > 0$, there are at most $8b^2$ subsets $\{k, l\}$ such that $\{k, l\} \subset 
B(i; b) \cup B(j; b)$, where $B(x; r)$ is the open ball $\{y : |x - y| < r\}$. For all other subsets $\{k, l\}$, by (S.1), 
we have 

\[
|\gamma_{i-k} \gamma_{j-l} + \gamma_{i-l} \gamma_{j-k}| \le C \Psi_4(b).
\]

On the other hand, using similar arguments as in Theorem 21 of Xiao and Wu (2011), we can show that 

\[
|\operatorname{Cum}(X_i, X_j, X_k, X_l)| \le C \Psi_4(\lfloor b/2 \rfloor).
\]

Therefore, if $\Psi_4(k) = o(1/\log k)$ as $k \to \infty$, then (A3) holds. 
Now we turn to (A3'). Write 

\[
\operatorname{Cov}(X_i X_j, X_k X_l) = E(X_i X_j X_k X_l) - \gamma_{i-j} \gamma_{k-l}.
\]

By (S.1), it is easily seen that 

l<.i,j.k.l<.m 



It then suffices to show 



l<i<j<k<l<rn 



which is true because by Eq. (38) of Xiao and Wu (2011) 

\[
[E(X_i X_j X_k X_l)]^2 = [E(X_i X_j X_k (X_l - X_{l,k}))]^2 \le 12 \|X_0\|_4^6 [\Psi_4(l - k)]^2.
\]

The proof of Lemma 3 is now complete. □ 

We now give the proof of Lemma 4. 



Proof of Lemma 4. Suppose $(Y_1, Y_2, Y_3, Y_4)$ has a joint normal distribution. We can write $Y_i = a_i^T Z$, where 
$Z$ is a four dimensional standard Gaussian random vector. For any $0 < \delta < 1$, define the subset of $\mathbb{R}^{16}$, 

\[
D_\delta = \{ (a_1^T, a_2^T, a_3^T, a_4^T) : |a_i| = 1 \text{ and } |a_i^T a_j| \le 1 - \delta \text{ for } 1 \le i \ne j \le 4 \}.
\]

Since $|\operatorname{Cor}(Y_1 Y_2, Y_3 Y_4)|$ is a continuous function on $D_\delta$, and $D_\delta$ is compact, the maximum correlation is 
attained at some point in $D_\delta$. 

On the other hand, elementary calculation shows that $|\operatorname{Cor}(Y_1 Y_2, Y_3 Y_4)| = 1$ if and only if $Y_1, Y_2, Y_3, Y_4$ are 
all perfectly correlated. The proof is now complete. □ 
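
The elementary calculation can be made concrete with Isserlis' formula for Gaussian fourth moments, a standard fact not quoted from the paper. With $r_{ij} = a_i^T a_j$, it gives $\operatorname{Cov}(Y_1 Y_2, Y_3 Y_4) = r_{13} r_{24} + r_{14} r_{23}$ and $\operatorname{Var}(Y_1 Y_2) = 1 + r_{12}^2$. The sketch below uses illustrative function names and an arbitrary choice $\delta = 0.1$. It evaluates the correlation on random points of $D_\delta$ and on the boundary configuration where all four vectors coincide. Random sampling only gives a lower estimate of the maximum over $D_\delta$.

```python
import math
import random

def unit_vector(rng, dim=4):
    """A random unit vector a, so that Y = a^T Z has variance one."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cor_of_products(a1, a2, a3, a4):
    """Cor(Y1*Y2, Y3*Y4) for Y_i = a_i^T Z, computed with Isserlis' formula."""
    r12, r34 = dot(a1, a2), dot(a3, a4)
    cov = dot(a1, a3) * dot(a2, a4) + dot(a1, a4) * dot(a2, a3)
    return cov / math.sqrt((1.0 + r12 ** 2) * (1.0 + r34 ** 2))

def in_D(vectors, delta):
    """Check |a_i^T a_j| <= 1 - delta for all pairs i != j."""
    return all(abs(dot(vectors[i], vectors[j])) <= 1.0 - delta
               for i in range(4) for j in range(i + 1, 4))

def sampled_max(delta, draws, seed=0):
    """Largest |Cor| seen over random points of D_delta (a lower estimate only)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(draws):
        vs = [unit_vector(rng) for _ in range(4)]
        if in_D(vs, delta):
            worst = max(worst, abs(cor_of_products(*vs)))
    return worst

if __name__ == "__main__":
    e = [1.0, 0.0, 0.0, 0.0]
    print(sampled_max(0.1, 10000), cor_of_products(e, e, e, e))
```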



The proof of Lemma 7 is a refined version of that of Lemma 20 in Xiao and Wu (2011). We need the 
following bounds on normal tail probabilities, which are taken from Lemma 19 of Xiao and Wu (2011). 

Denote by $\varphi_d((r_{ij}); x_1, \ldots, x_d)$ the density of a $d$-dimensional multivariate normal random vector $X = 
(X_1, \ldots, X_d)^T$ with mean zero and covariance matrix $(r_{ij})$, where we always assume $r_{ii} = 1$ for $1 \le i \le d$ 
and $(r_{ij})$ is nonsingular. Let 

\[
Q_d((r_{ij}); z) = \int_z^\infty \cdots \int_z^\infty \varphi_d((r_{ij}); x_1, \ldots, x_d) \, dx_d \cdots dx_1.
\]



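Lemma S.1. For every $z > 0$, $0 < s < 1$, $d \ge 1$ and $\epsilon > 0$, there exist positive constants $C_d$ and $\epsilon_d$ such 
that for $0 < \epsilon < \epsilon_d$: 

1. if $|r_{ij}| < \epsilon$ for all $1 \le i < j \le d$, then 

\[
Q_d((r_{ij}); z) \le C_d f_d(\epsilon, 1/z) \exp\Big\{ -\Big( \frac{d}{2} - C_d \epsilon \Big) z^2 \Big\}, \tag{S.2}
\]

where $f_{2k}(x, y) = \sum_{l=0}^{k} x^l y^{2(k-l)}$ and $f_{2k-1}(x, y) = \sum_{l=0}^{k-1} x^l y^{2(k-l)-1}$ for $k \ge 1$; 

2. if for all $1 \le i < j \le d + 1$ such that $(i, j) \ne (1, 2)$, $|r_{ij}| \le \epsilon$, then 

\[
Q_{d+1}((r_{ij}); z) \le C_d \exp\Big\{ -\Big( \frac{(1 - s)^2 + d}{2} - C_d \epsilon \Big) z^2 \Big\}. \tag{S.3}
\]
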



We first give a one-sided version of Lemma 7 and its proof, then we show how it implies Lemma 7. 
Lemma S.2. Assume either (B1) or (B2). For a positive real number $z_n$, define the event $A_{n,i}$ and $Q_{n,d}$ as 

\[
A_{n,i} = \{X_{n,i} > z_n\} \quad \text{and} \quad Q_{n,d} = \sum_{A \subset \mathcal{I}_n,\, |A| = d} P\Big( \bigcap_{i \in A} A_{n,i} \Big).
\]

If $z_n$ satisfies $z_n^2 = 2\log s_n - \log\log s_n - \log(4\pi) + 2z + o(1)$, then for all $d \ge 1$, 

\[
\lim_{n \to \infty} Q_{n,d} = \frac{e^{-dz}}{d!}.
\]

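Proof. The following facts about normal tail probabilities are well-known: 

\[
P(X_1 \ge x) \le \frac{1}{\sqrt{2\pi}\, x} e^{-x^2/2} \text{ for } x > 0 \quad \text{and} \quad \lim_{x \to \infty} \frac{P(X_1 \ge x)}{(1/(\sqrt{2\pi}\, x)) \exp\{-x^2/2\}} = 1. \tag{S.4}
\]

By the assumption on $z_n$, if for each $n$, $(X_{n,i})_{i \in \mathcal{I}_n}$ are i.i.d., then by (S.4) 

\[
\lim_{n \to \infty} Q_{n,d} = \lim_{n \to \infty} \binom{s_n}{d} Q_d(I_d; z_n)
= \lim_{n \to \infty} \frac{s_n^d}{d!} \cdot \frac{1}{(2\pi)^{d/2} z_n^d} \exp\Big\{ -\frac{d z_n^2}{2} \Big\}
= \frac{e^{-dz}}{d!}.
\]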
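
Both facts in (S.4) and the i.i.d. limit can be checked numerically. The sketch below uses illustrative function names and again drops the $o(1)$ term in $z_n$. It prints the ratio of the exact upper tail to the bound in (S.4), and compares $\binom{s_n}{d} P(X_1 > z_n)^d$ with $e^{-dz}/d!$.

```python
import math

def upper_tail(x):
    """P(X >= x) for a standard normal X."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def mills_bound(x):
    """The bound exp(-x^2/2) / (sqrt(2 pi) x) from (S.4), valid for x > 0."""
    return math.exp(-0.5 * x * x) / (math.sqrt(2.0 * math.pi) * x)

def one_sided_threshold(s, z):
    """z_n with z_n^2 = 2 log s - log log s - log(4 pi) + 2 z (o(1) term dropped)."""
    return math.sqrt(2.0 * math.log(s) - math.log(math.log(s))
                     - math.log(4.0 * math.pi) + 2.0 * z)

def Q_iid(s, d, z):
    """Q_{n,d} in the i.i.d. case: C(s, d) * P(X > z_n)^d."""
    return math.comb(s, d) * upper_tail(one_sided_threshold(s, z)) ** d

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 5.0, 10.0):
        # The ratio stays below one and approaches one as x grows.
        print(x, upper_tail(x) / mills_bound(x))
    z = 0.5
    for d in (1, 2, 3):
        print(d, Q_iid(10**24, d, z), math.exp(-d * z) / math.factorial(d))
```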

When the $X_{n,i}$'s are dependent, the result is still trivially true when $d = 1$. Now we deal with the $d \ge 2$ 
case. Suppose $(b_n)$ is a sequence of positive numbers which converges to infinity. For each subset $J$ of $\mathcal{I}_n$ 
with cardinality $|J| = d$, we define an undirected graph $\mathcal{G}(J)$ by identifying each $i \in J$ with a node and 
saying $i$ and $j$ are adjacent if $|r_{n,i,j}| > \gamma(n, b_n)$. Suppose the graph $\mathcal{G}(J)$ has $d - s$ connected components 
$B_1, \ldots, B_{d-s}$. If $s \ge 1$, assume w.l.o.g. that $|B_1| \ge 2$. Pick $k_0, k_1 \in B_1$, and $k_p \in B_p$ for $2 \le p \le d - s$, 
and set $K = \{k_0, k_1, k_2, \ldots, k_{d-s}\}$. Define $Q_J = P(\bigcap_{k \in J} A_{n,k})$ and $Q_K$ similarly, then $Q_J \le Q_K$. By (S.3) of 
Lemma S.1, there exists a number $M > 1$ depending on $d$ and the sequences $(\gamma_n)$ and $(b_n)$, such that when 
$n > M$, 

\[
\begin{aligned}
Q_K &\le C_{d-s} \exp\Big\{ -\Big( \frac{(1 - \gamma_n)^2 + d - s}{2} - C_{d-s}\, \gamma(n, b_n) \Big) z_n^2 \Big\} \\
&\le C_{d-s} \exp\Big\{ -\Big( \frac{d - s}{2} + \frac{(1 - \gamma_n)^2}{3} \Big) z_n^2 \Big\}.
\end{aligned}
\]

Note that $z_n^2 = 2\log s_n - \log\log s_n + O(1)$. Pick $b_n = \lfloor s_n^{\alpha} \rfloor$ for some $\alpha < (1 - \gamma_n)^2/(3d)$. For any $1 \le a \le d - 1$, 
since there are at most $O(b_n^a s_n^{d-a})$ subsets $J \subset \mathcal{I}_n$ such that $|J| = d$ and the graph $\mathcal{G}(J)$ has $d - a$ connected 
components, we know the sum of $Q_J$ over these $J$ is dominated by 

\[
C_{d-a} \exp\Big\{ \log s_n \Big[ (d - a) + \frac{2(d-1)(1 - \gamma_n)^2}{3d} - (d - a) - \frac{2(1 - \gamma_n)^2}{3} \Big] \Big\}
\]

when $n$ is large enough, which converges to zero. Therefore, it remains to consider all the subsets $J \subset \mathcal{I}_n$ 
such that the graph $\mathcal{G}(J)$ has no edges. 
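
The graph construction used in this step can be sketched as follows. The function names, the example matrix $r_{i,j} = 0.5^{|i-j|}$ and the cut-off $0.2$ standing in for $\gamma(n, b_n)$ are all assumptions made for illustration. Two indices of $J$ are joined when the absolute correlation exceeds the cut-off, and the connected components are found by a depth-first search.

```python
def components(J, R, cutoff):
    """Connected components of the graph on J with i ~ j iff |R[i][j]| > cutoff."""
    J = list(J)
    seen, comps = set(), []
    for start in J:
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            i = stack.pop()
            comp.append(i)
            for j in J:
                if j not in seen and abs(R[i][j]) > cutoff:
                    seen.add(j)
                    stack.append(j)
        comps.append(sorted(comp))
    return comps

def ar1_corr(n, rho):
    """Hypothetical example matrix with entries rho^|i-j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    R = ar1_corr(10, 0.5)
    J = [0, 1, 3, 7, 9]
    # Indices at distance at most 2 are adjacent, since 0.25 > 0.2 > 0.125.
    print(components(J, R, 0.2))
```

In this example $J$ has $d = 5$ elements and two connected components, so $s = 3$ in the notation of the proof. Subsets whose graph has no edges are exactly those with $d$ singleton components.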

Let $J \subset \mathcal{I}_n$ be a subset such that $|J| = d$, and $|r_{n,i,j}| \le \gamma(n, b_n)$ for all pairs $i, j$ such that $i, j \in J$ and 
$i \ne j$, and let $\mathcal{J}(d, b_n)$ be the collection of all such subsets. Let $(r_{ij})_{i,j \in J}$ be the $d$-dimensional covariance matrix 
of $X_J := (X_{n,i})_{i \in J}$. There exists a matrix $R_J = \theta (r_{ij})_{i,j \in J} + (1 - \theta) I_d$ for some $0 < \theta < 1$ such that 

\[
Q_J - Q_d(I_d; z_n) = \sum_{h, l \in J,\, h < l} \frac{\partial Q_d}{\partial r_{hl}} [R_J; z_n] \, r_{hl}.
\]



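Let $R_H$, $H = J \setminus \{h, l\}$, be the correlation matrix of the conditional distribution of $X_H$ given $X_h$ and $X_l$. 
By (S.2) of Lemma S.1, for $n$ large enough 

\[
\begin{aligned}
\frac{\partial Q_d}{\partial r_{hl}} [R_J; z_n]
&\le C\, C_{d-2}\, f_{d-2}(\gamma(n, b_n), 1/z_n) \exp\Big\{ -\frac{z_n^2}{1 + |r_{n,h,l}|} \Big\} \\
&\quad \times \exp\Big\{ -\Big( \frac{d-2}{2} - 2 C_{d-2}\, \gamma(n, b_n) \Big) (1 - 3\gamma(n, b_n))^2 z_n^2 \Big\} \\
&\le C_d\, f_{d-2}(\gamma(n, b_n), 1/z_n) \exp\Big\{ -\Big( \frac{d}{2} - (2 C_{d-2} + 3(d-2))\, \gamma(n, b_n) - |r_{n,h,l}| \Big) z_n^2 \Big\} \\
&\le C_d\, f_{d-2}(\gamma(n, b_n), 1/z_n) \exp\Big\{ -\Big( \frac{d}{2} - C_d\, \gamma(n, b_n) \Big) z_n^2 \Big\}.
\end{aligned}
\]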



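It follows that 

\[
\begin{aligned}
\sum_{J \in \mathcal{J}(d, b_n)} |Q_J - Q_d(I_d; z_n)|
&\le C_d\, f_{d-2}(\gamma(n, b_n), 1/z_n) \sum_{J \in \mathcal{J}(d, b_n)} \sum_{i, j \in J,\, i < j} \exp\Big\{ -\Big( \frac{d}{2} - C_d\, \gamma(n, b_n) \Big) z_n^2 \Big\} |r_{n,i,j}| \\
&\le C_d\, f_{d-2}(\gamma(n, b_n), 1/z_n)\, s_n^{d-2} \exp\Big\{ -\Big( \frac{d}{2} - C_d\, \gamma(n, b_n) \Big) z_n^2 \Big\} \sum\nolimits^{*} |r_{n,i,j}|,
\end{aligned}
\tag{S.5}
\]

where the sum $\sum^{*}$ is over all the pairs $(i, j)$ such that $|r_{n,i,j}| \le \gamma(n, b_n)$. Under the assumption (B1), we 
have 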

\[
\sum_{J \in \mathcal{J}(d, b_n)} |Q_J - Q_d(I_d; z_n)|
\le C_d\, f_{d-2}(\gamma(n, b_n), 1/z_n) (\log s_n)^{d/2}\, \gamma(n, b_n) \exp\{ C_d\, \gamma(n, b_n) (\log s_n) \}.
\tag{S.6}
\]

Since $\lim_{n \to \infty} \gamma(n, b_n) \log b_n = 0$, it also holds that $\lim_{n \to \infty} \gamma(n, b_n) \log s_n = 0$. Note that $\lim_{n \to \infty} (\log s_n)^{1/2} / z_n = 
2^{-1/2}$; it follows that $\lim_{n \to \infty} f_{d-2}(\gamma(n, b_n), 1/z_n) (\log s_n)^{d/2 - 1} = 2^{-d/2 + 1}$. Therefore, the term in (S.6) 
converges to zero, and the lemma holds under (B1). 
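Alternatively, if (B2) is true, from (S.5) we have 

\[
\begin{aligned}
\sum_{J \in \mathcal{J}(d, b_n)} |Q_J - Q_d(I_d; z_n)|
&\le C_d\, f_{d-2}(\gamma(n, b_n), 1/z_n)\, s_n^{-2} (\log s_n)^{d/2} \exp\{ C_d\, \gamma(n, b_n) (\log s_n) \} \sum\nolimits^{*} |r_{n,i,j}| \\
&\le C_d\, f_{d-2}(\gamma(n, b_n), 1/z_n)\, s_n^{-1} (\log s_n)^{d/2} \exp\{ C_d\, \gamma(n, b_n) (\log s_n) \} \Big( \sum\nolimits^{*} r_{n,i,j}^2 \Big)^{1/2} \\
&\le C_d\, s_n^{-\delta/2} (\log s_n) \exp\{ C_d\, \gamma(n, b_n) (\log s_n) \} = o(1),
\end{aligned}
\]

and the proof is complete. □ 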

Now we give the proof of Lemma 7. 

Proof of Lemma 7. In the proof of Lemma S.2, the upper bounds on $Q_J$ and $|Q_J - Q_d(I_d; z_n)|$ are expressed 
through the absolute values of the covariances, so we can obtain the same bounds for probabilities of the 
form $P(\bigcap_{1 \le i \le d} \{(-1)^{a_i} X_i > z_n\})$ for any $(a_1, \ldots, a_d) \in \{0, 1\}^d$. Based on this observation, Lemma 7 is an 
immediate consequence of Lemma S.2. □ 